Mixture Models in Machine Learning
Mixture modeling is a powerful method in the statistical toolkit for representing the presence of sub-populations within an overall population. In many applications, ranging from financial modeling to genetics, a mixture model is used to fit the data. The primary difficulty in learning mixture models is that the observed data do not identify the sub-population to which an individual observation belongs. Despite being studied for more than a century, the theoretical guarantees for mixture models remain unknown in several important settings.
In this thesis, we look at three groups of problems. The first part is aimed at estimating the parameters of a mixture of simple distributions, and we ask the following question: how many samples are necessary and sufficient to learn the latent parameters? We propose several approaches to this problem, including complex-analytic tools that connect statistical distances between pairs of mixtures with their characteristic functions. We prove sample complexity guarantees for mixtures of popular distributions (including Gaussian, Poisson, and Geometric); for many of these distributions, our results provide the first sample complexity guarantees for parameter estimation in the corresponding mixture. Using these techniques, we also obtain improved lower bounds on the total variation distance between two-component Gaussian mixtures and demonstrate new results in some sequence reconstruction problems.
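As a point of reference for this parameter-estimation task, the sketch below fits a two-component Gaussian mixture with the classical EM heuristic. This is only a standard baseline that makes the latent-variable difficulty concrete; it is not the complex-analytic machinery developed in the thesis, and all constants in it are illustrative.

```python
# Minimal EM sketch for a two-component 1-D Gaussian mixture with unit
# variances: only the mixing weight and the two means are estimated.
import numpy as np

rng = np.random.default_rng(0)

# Sample from a ground-truth mixture: w*N(mu1, 1) + (1-w)*N(mu2, 1).
w_true, mu1_true, mu2_true = 0.4, -2.0, 2.0
n = 5000
z = rng.random(n) < w_true                # latent component labels (unobserved)
x = np.where(z, rng.normal(mu1_true, 1.0, n), rng.normal(mu2_true, 1.0, n))

w, mu1, mu2 = 0.5, -1.0, 1.0              # crude initialization
for _ in range(200):
    # E-step: posterior responsibility of component 1 for each sample
    # (normalizing constants cancel because both variances are equal).
    p1 = w * np.exp(-0.5 * (x - mu1) ** 2)
    p2 = (1 - w) * np.exp(-0.5 * (x - mu2) ** 2)
    r = p1 / (p1 + p2)
    # M-step: re-estimate parameters from the responsibilities.
    w = r.mean()
    mu1 = (r * x).sum() / r.sum()
    mu2 = ((1 - r) * x).sum() / (1 - r).sum()

print(f"estimated w={w:.3f}, mu1={mu1:.3f}, mu2={mu2:.3f}")
```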
In the second part, we study Mixtures of Sparse Linear Regressions, where the goal is to learn the best set of linear relationships between the scalar responses (i.e., labels) and the explanatory variables (i.e., features). We focus on a scenario where the learner can choose the feature vectors for which labels are obtained. To tackle the high dimensionality of the data, we further assume that the linear maps are sparse, i.e., have only a few prominent features among many. For this setting, we devise algorithms with sub-linear (as a function of the dimension) sample complexity guarantees that are also robust to noise.
In the final part, we study Mixtures of Sparse Linear Classifiers in the same setting as above. Given a set of features and the binary labels, the objective of this task is to find a set of hyperplanes in the feature space such that for every (feature, label) pair there exists a hyperplane in the set that justifies the mapping. We devise efficient algorithms with sub-linear sample complexity guarantees for learning the unknown hyperplanes under similar sparsity assumptions as above. To that end, we propose several novel techniques, including tensor decomposition methods and combinatorial designs.
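Parts two and three share a query-access model: the learner designs a feature vector, and an unknown member of a small family of sparse linear maps produces the label. The toy sketch below simulates that oracle for both the regression and the classification variants; all names (regression_oracle, classifier_oracle) and parameters are hypothetical, and this is not the thesis's recovery algorithm.

```python
# Toy simulator of the query model: each query is answered by one of L hidden
# k-sparse linear maps, chosen uniformly at random per query.
import numpy as np

rng = np.random.default_rng(1)
d, k, L = 1000, 5, 2                  # ambient dimension, sparsity, number of maps

# Hidden k-sparse parameter vectors, one per mixture component.
betas = np.zeros((L, d))
for b in betas:
    b[rng.choice(d, size=k, replace=False)] = rng.normal(size=k)

def regression_oracle(x, noise=0.01):
    """Return <beta_j, x> + noise for a uniformly random hidden component j."""
    j = rng.integers(L)
    return betas[j] @ x + noise * rng.normal()

def classifier_oracle(x):
    """Return sign(<beta_j, x>) for a uniformly random hidden component j."""
    j = rng.integers(L)
    return np.sign(betas[j] @ x)

# Example query: probing a single coordinate. Coordinates outside every
# support yield (near) zero responses, which hints at why sub-linear-in-d
# query schemes are possible under sparsity.
e0 = np.zeros(d)
e0[0] = 1.0
print(regression_oracle(e0), classifier_oracle(e0))
```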
Trace Reconstruction: Generalized and Parameterized
In the beautifully simple-to-state problem of trace reconstruction, the goal is to reconstruct an unknown binary string x given random "traces" of x where each trace is generated by deleting each coordinate of x independently with probability p<1. The problem is well studied both when the unknown string is arbitrary and when it is chosen uniformly at random. For both settings, there is still an exponential gap between upper and lower sample complexity bounds and our understanding of the problem is still surprisingly limited. In this paper, we consider natural parameterizations and generalizations of this problem in an effort to attain a deeper and more comprehensive understanding. Perhaps our most surprising results are:
1) We prove that exp(O(n^(1/4) sqrt(log n))) traces suffice for reconstructing arbitrary matrices. In the matrix version of the problem, each row and column of an unknown sqrt(n) x sqrt(n) matrix is deleted independently with probability p. This contrasts with sequence reconstruction, where the best known upper bound is exp(O(n^(1/3))).
2) An optimal result for random matrix reconstruction: we show that Theta(log n) traces are necessary and sufficient. This is in contrast to the problem for random sequences, where there is a super-logarithmic lower bound and the best known upper bound is exp(O(log^(1/3) n)).
3) We show that exp(O(k^(1/3) log^(2/3) n)) traces suffice to reconstruct k-sparse strings, providing an improvement over the best known sequence reconstruction results when k = o(n/log^2 n).
4) We show that poly(n) traces suffice if x is k-sparse and we additionally have a "separation" promise, specifically that the indices of 1's in x all differ by Omega(k log n).
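To make the deletion-channel model concrete, here is a small simulator together with Bitwise Majority Alignment, a classical heuristic that reconstructs random strings at low deletion rates. It illustrates only the problem setup, not the techniques behind the bounds above; the parameters are arbitrary and unlucky seeds can fail.

```python
# Deletion-channel simulator plus Bitwise Majority Alignment (BMA).
import numpy as np

rng = np.random.default_rng(2)

def trace(x, p):
    """Delete each coordinate of x independently with probability p."""
    return x[rng.random(len(x)) > p]

def bma(traces, n):
    """Reconstruct n bits by majority vote at per-trace pointers; a pointer
    advances only when its trace agrees with the majority (a disagreeing
    trace is assumed to have had the current bit deleted)."""
    ptr = [0] * len(traces)
    out = []
    for _ in range(n):
        bits = [t[c] for t, c in zip(traces, ptr) if c < len(t)]
        b = int(np.mean(bits) >= 0.5) if bits else 0
        out.append(b)
        for i, (t, c) in enumerate(zip(traces, ptr)):
            if c < len(t) and t[c] == b:
                ptr[i] += 1
    return np.array(out)

n, p, T = 64, 0.1, 20
x = rng.integers(0, 2, n)                 # unknown random binary string
x_hat = bma([trace(x, p) for _ in range(T)], n)
print("exact recovery:", np.array_equal(x, x_hat))
```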
Support Recovery in Universal One-Bit Compressed Sensing
One-bit compressed sensing (1bCS) is an extreme-quantized signal acquisition method that has been widely studied in the past decade. In 1bCS, linear samples of a high-dimensional signal are quantized to only one bit per sample (the sign of the measurement). Assuming the original signal vector to be sparse, existing results either aim to find the support of the vector or to approximate the signal within an epsilon-ball. The focus of this paper is support recovery, which often also computationally facilitates approximate signal recovery. A universal measurement matrix for 1bCS refers to one set of measurements that works for all sparse signals. With universality, it is known that Theta(k^2) 1bCS measurements (up to logarithmic factors) are necessary and sufficient for support recovery, where k denotes the sparsity. In this work, we show that it is possible to universally recover the support with a small number of false positives using O(k^(3/2)) measurements (up to logarithmic factors). If the dynamic range of the signal vector is known, then with a different technique this result can be improved to only O(k) measurements (again up to logarithmic factors). Further results on support recovery are also provided.
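The following sketch illustrates the 1bCS measurement model: sign measurements of a sparse vector through a Gaussian matrix, with the support estimated by a simple correlation score. Gaussian measurements and this estimator are assumptions made for illustration; they are not the paper's universal construction.

```python
# Toy 1bCS instance: y = sign(A x) with a k-sparse x and Gaussian A.
import numpy as np

rng = np.random.default_rng(3)
d, k, m = 500, 5, 300                 # ambient dimension, sparsity, measurements

# A k-sparse signal with bounded dynamic range (entries in [1, 2]).
x = np.zeros(d)
supp = rng.choice(d, size=k, replace=False)
x[supp] = rng.uniform(1.0, 2.0, size=k)

A = rng.normal(size=(m, d))           # illustrative (non-universal) measurements
y = np.sign(A @ x)                    # one bit per linear sample

# Score each coordinate by correlation with the sign measurements and keep the
# top k; support coordinates correlate with y while the rest average out.
scores = np.abs(A.T @ y)
supp_hat = np.argsort(-scores)[:k]
print("support recovered:", set(supp_hat.tolist()) == set(supp.tolist()))
```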
Connectivity of Random Annulus Graphs and the Geometric Block Model
We provide new connectivity results for vertex-random graphs, or random annulus graphs, which are significant generalizations of random geometric graphs. Random geometric graphs (RGGs) are one of the most basic models of random graphs for spatial networks, proposed by Gilbert in 1961 shortly after the introduction of the Erdős–Rényi random graphs. They resemble social networks in many ways, e.g., by spontaneously creating clusters of nodes with high modularity. The connectivity properties of RGGs have been studied since their introduction, and analyzing them has been significantly harder than for their Erdős–Rényi counterparts due to correlated edge formation.
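A quick way to build intuition for these graphs is to sample them. The sketch below draws n points on a circle of unit circumference, connects pairs whose circular distance lies in [r1, r2], and checks connectivity with union-find; r1 = 0 recovers a one-dimensional RGG. The radii are illustrative guesses, not the exact thresholds the paper establishes.

```python
# Sample a random annulus graph on a circle and test connectivity.
import numpy as np

rng = np.random.default_rng(4)

def circ_dist(a, b):
    d = abs(a - b)
    return min(d, 1.0 - d)

def is_connected(pos, r1, r2):
    """Union-find connectivity for the annulus graph with radii [r1, r2]."""
    n = len(pos)
    parent = list(range(n))
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]   # path halving
            u = parent[u]
        return u
    for i in range(n):
        for j in range(i + 1, n):
            if r1 <= circ_dist(pos[i], pos[j]) <= r2:
                parent[find(i)] = find(j)
    return len({find(u) for u in range(n)}) == 1

n = 500
scale = np.log(n) / n                       # the natural connectivity scale
pos = rng.random(n)
print("RGG connected:    ", is_connected(pos, 0.0, 2.0 * scale))
print("annulus connected:", is_connected(pos, 0.5 * scale, 2.0 * scale))
```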
Our next contribution uses the connectivity of random annulus graphs to provide necessary and sufficient conditions for efficient recovery of communities in the geometric block model (GBM). The GBM is a probabilistic model for community detection defined over an RGG, in a similar spirit to the popular stochastic block model, which is defined over an Erdős–Rényi random graph. The geometric block model inherits the transitivity properties of RGGs and thus models communities better than a stochastic block model. However, analyzing it requires fresh perspectives, as prior tools fail due to the correlation in edge formation. We provide a simple and efficient algorithm that can recover communities in the GBM exactly with high probability in the regime of connectivity.
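The sketch below samples a toy two-community GBM on a circle and classifies each edge as intra- or inter-community by counting common neighbors, loosely in the spirit of motif-counting approaches to the GBM. The radii and threshold are illustrative guesses rather than the paper's calibrated values, so exact recovery is not guaranteed on every seed.

```python
# Toy GBM sampler with a common-neighbor edge test for community recovery.
import numpy as np
from itertools import combinations

rng = np.random.default_rng(5)

n, r_in, r_out, thresh = 400, 0.15, 0.01, 18
label = np.arange(n) % 2              # two hidden communities (unobserved)
pos = rng.random(n)                   # positions on a circle of circumference 1

def circ_dist(a, b):
    d = abs(a - b)
    return min(d, 1.0 - d)

# GBM edge rule: same community within r_in, different communities within r_out.
adj = [set() for _ in range(n)]
for u, v in combinations(range(n), 2):
    r = r_in if label[u] == label[v] else r_out
    if circ_dist(pos[u], pos[v]) <= r:
        adj[u].add(v)
        adj[v].add(u)

# Keep only edges whose endpoints share many neighbors (likely intra-community),
# then read candidate communities off the connected components of what remains.
keep = [{v for v in adj[u] if len(adj[u] & adj[v]) >= thresh} for u in range(n)]

comp = [-1] * n
c = 0
for s in range(n):
    if comp[s] == -1:
        stack, comp[s] = [s], c
        while stack:
            u = stack.pop()
            for v in keep[u]:
                if comp[v] == -1:
                    comp[v] = c
                    stack.append(v)
        c += 1

pure = all(len({int(label[u]) for u in range(n) if comp[u] == i}) == 1
           for i in range(c))
print(f"components found: {c} (ideally 2), all pure: {pure}")
```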